Arboretum: A Large Multimodal Dataset Enabling AI for Biodiversity (Supplemental Material)

Neural Information Processing Systems

Arboretum is a 134.6M-sample multimodal dataset designed to advance AI for biodiversity applications, providing large-scale, accurately annotated images with corresponding textual descriptions for a diverse set of species. Arboretum aims to facilitate the development of AI models for species identification, ecological monitoring, and agricultural research. Additionally, we introduce three new benchmark datasets: Arboretum-Unseen, Arboretum-LifeStages, and Arboretum-Balanced. As the authors of this submission, we affirm that we bear all responsibility in case of any rights violations or ethical issues associated with this work. We confirm that the submitted work is original, and any third-party content it includes is used with proper permissions and attributions.


BioTrove: A Large Curated Image Dataset Enabling AI for Biodiversity

Neural Information Processing Systems

We introduce BioTrove, the largest publicly accessible dataset designed to advance AI applications in biodiversity. Curated from the iNaturalist platform and vetted to include only research-grade data, BioTrove contains 161.9 million images, offering unprecedented scale and diversity from three primary kingdoms: Animalia ("animals"), Fungi ("fungi"), and Plantae ("plants"), spanning approximately 366.6K species. Each image is annotated with scientific names, taxonomic hierarchies, and common names, providing rich metadata to support accurate AI model development across diverse species and ecosystems. We demonstrate the value of BioTrove by releasing a suite of CLIP models trained using a subset of 40 million captioned images, known as BioTrove-Train. This subset focuses on seven categories within the dataset that are underrepresented in standard image recognition models, selected for their critical role in biodiversity and agriculture: Aves ("birds"), Arachnida ("spiders/ticks/mites"), Insecta ("insects"), Plantae ("plants"), Fungi ("fungi"), Mollusca ("snails"), and Reptilia ("snakes/lizards"). To support rigorous assessment, we introduce several new benchmarks and report model accuracy for zero-shot learning across life stages, rare species, confounding species, and multiple taxonomic levels. We anticipate that BioTrove will spur the development of AI models capable of supporting digital tools for pest control, crop monitoring, biodiversity assessment, and environmental conservation. These advancements are crucial for ensuring food security, preserving ecosystems, and mitigating the impacts of climate change. BioTrove is publicly available, easily accessible, and ready for immediate use.
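
As a rough illustration of how the released CLIP-style models could be used for zero-shot species identification, the sketch below relies on the open_clip library; the backbone, pretrained tag, image path, and label prompts are placeholders for illustration, not the actual BioTrove checkpoints or prompt templates.

```python
import torch
import open_clip
from PIL import Image

# Generic CLIP backbone; the model name and pretrained tag are placeholders,
# not the released BioTrove weights.
model, _, preprocess = open_clip.create_model_and_transforms(
    "ViT-B-16", pretrained="laion2b_s34b_b88k"
)
tokenizer = open_clip.get_tokenizer("ViT-B-16")
model.eval()

# Candidate labels built from taxonomic metadata (scientific + common names).
labels = [
    "a photo of Danaus plexippus, the monarch butterfly",
    "a photo of Apis mellifera, the western honey bee",
    "a photo of Amanita muscaria, the fly agaric mushroom",
]

image = preprocess(Image.open("specimen.jpg")).unsqueeze(0)
text = tokenizer(labels)

with torch.no_grad():
    image_features = model.encode_image(image)
    text_features = model.encode_text(text)
    image_features /= image_features.norm(dim=-1, keepdim=True)
    text_features /= text_features.norm(dim=-1, keepdim=True)
    probs = (100.0 * image_features @ text_features.T).softmax(dim=-1)

print(labels[probs.argmax().item()], probs.max().item())
```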


Domain Adaptation Framework for Turning Movement Count Estimation with Limited Data

arXiv.org Artificial Intelligence

Urban transportation networks are vital for the efficient movement of people and goods, necessitating effective traffic management and planning. An integral part of traffic management is understanding turning movement counts (TMCs) at intersections. Accurate TMCs are crucial for traffic signal control, congestion mitigation, and road safety. In general, TMCs are obtained using physical sensors installed at intersections, but this approach can be cost-prohibitive and technically challenging, especially for cities with extensive road networks. Recent advancements in machine learning and data-driven approaches have offered promising alternatives for estimating TMCs. However, traffic patterns can vary significantly across intersections due to factors such as road geometry, traffic signal settings, and local driver behaviors. This domain discrepancy limits the generalizability and accuracy of machine learning models when applied to new or unseen intersections. In response to these limitations, this research proposes a novel framework leveraging domain adaptation (DA) to estimate TMCs at intersections using traffic controller event-based data, road infrastructure data, and point-of-interest (POI) data. Evaluated on 30 intersections in Tucson, Arizona, the proposed DA framework was compared with state-of-the-art models and achieved the lowest Mean Absolute Error and Root Mean Square Error.
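
The abstract does not specify the adaptation mechanism, so the sketch below shows one common way to set up domain adaptation for a regression task: a DANN-style PyTorch model with a gradient-reversal layer. The feature dimensions, heads, and hyperparameters are illustrative placeholders, not the paper's architecture.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; reverses (and scales) gradients on backward."""
    @staticmethod
    def forward(ctx, x, lamb):
        ctx.lamb = lamb
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lamb * grad_output, None

class DATMCRegressor(nn.Module):
    """Shared encoder + TMC regression head + domain classifier head."""
    def __init__(self, in_dim, hidden=64, n_movements=12):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(),
                                     nn.Linear(hidden, hidden), nn.ReLU())
        self.regressor = nn.Linear(hidden, n_movements)   # counts per turning movement
        self.domain_clf = nn.Linear(hidden, 2)            # source vs. target intersection

    def forward(self, x, lamb=1.0):
        z = self.encoder(x)
        counts = self.regressor(z)
        domain = self.domain_clf(GradReverse.apply(z, lamb))
        return counts, domain
```

Training such a model would combine a count-regression loss on labeled source intersections with a domain-classification loss on features from both source and target intersections, so the encoder learns intersection-invariant representations.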


Active management of battery degradation in wireless sensor network using deep reinforcement learning for group battery replacement

arXiv.org Artificial Intelligence

Wireless sensor networks (WSNs) have become a promising solution for structural health monitoring (SHM), especially in hard-to-reach or remote locations. Battery-powered WSNs offer various advantages over wired systems; however, limited battery life has remained one of the biggest obstacles to their practical use, regardless of energy harvesting methods. While various methods have been studied for battery health management, existing methods aim exclusively to extend the lifetime of individual batteries, lacking a system-level view. A consequence of applying such methods is that batteries in a WSN tend to fail at different times, posing significant difficulty for planning and scheduling battery replacement trips. This study investigates a deep reinforcement learning (DRL) method for active battery degradation management by optimizing the duty cycles of WSNs at the system level. This active management strategy reduces early failures of individual batteries, enabling group replacement without sacrificing WSN performance. A simulated environment based on a real-world WSN setup was developed to train a DRL agent and learn optimal duty cycle strategies. The performance of the strategy was validated in a long-term setup with various network sizes, demonstrating its efficiency and scalability.
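
To illustrate the system-level framing, here is a toy environment in which an agent chooses per-node duty cycles and is rewarded for sensing coverage while being penalized for uneven battery depletion; all dynamics, coefficients, and the reward shape are invented for illustration and are not the paper's simulator.

```python
import numpy as np

class GroupReplacementEnv:
    """Toy WSN environment: the agent picks a duty cycle for each node; the reward
    trades off sensing coverage against battery-level dispersion across nodes,
    encouraging batteries to deplete together so they can be replaced as a group."""

    def __init__(self, n_nodes=10, capacity=1.0):
        self.n_nodes, self.capacity = n_nodes, capacity

    def reset(self):
        self.soc = np.full(self.n_nodes, self.capacity)   # state of charge per node
        return self.soc.copy()

    def step(self, duty_cycles):
        # Higher duty cycle -> more data collected but faster (noisy) depletion.
        drain = 0.01 * duty_cycles * (1 + 0.1 * np.random.randn(self.n_nodes))
        self.soc = np.clip(self.soc - drain, 0.0, self.capacity)
        coverage = duty_cycles.mean()
        dispersion = self.soc.std()           # penalize uneven degradation
        reward = coverage - 5.0 * dispersion
        done = bool((self.soc <= 0.0).all())  # group replacement once all are depleted
        return self.soc.copy(), reward, done, {}

# Example interaction (a DRL agent such as PPO would replace the random policy):
env = GroupReplacementEnv()
obs = env.reset()
obs, reward, done, _ = env.step(np.random.uniform(0.1, 1.0, size=env.n_nodes))
```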


Sampling Decisions

arXiv.org Machine Learning

In this manuscript we introduce a novel Decision Flow (DF) framework for sampling from a target distribution while incorporating additional guidance from a prior sampler. DF can be viewed as an AI-driven algorithmic reincarnation of the Markov Decision Process (MDP) approach in Stochastic Optimal Control. It extends the continuous-space, continuous-time Path Integral Diffusion sampling technique to discrete time and space, while also generalizing the Generative Flow Network framework. In its most basic form, an explicit, Neural Network (NN)-free formulation, DF leverages the linear solvability of the underlying MDP to adjust the transition probabilities of the prior sampler. The resulting Markov Process is expressed as a convolution of the reverse-time Green's function of the prior sampler with the target distribution. We illustrate the DF framework through an example of sampling from the Ising model, discuss potential NN-based extensions, and outline how DF can enhance guided sampling across various applications.
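
To make the "adjusting the transition probabilities via a reverse-time Green's function" statement concrete, here is a standard linearly-solvable-MDP sketch under the simplifying assumption that guidance enters only through a terminal target $\pi(x_T)$; the notation ($p^{(0)}$, $G_t$) is ours, not necessarily the manuscript's.

$$G_T(x) = \frac{\pi(x)}{p^{(0)}_T(x)}, \qquad G_t(x_t) = \sum_{x_{t+1}} p^{(0)}(x_{t+1}\mid x_t)\, G_{t+1}(x_{t+1}),$$

$$p^{\star}(x_{t+1}\mid x_t) = \frac{p^{(0)}(x_{t+1}\mid x_t)\, G_{t+1}(x_{t+1})}{G_t(x_t)},$$

where $p^{(0)}$ is the prior sampler's transition kernel and $p^{(0)}_T$ its terminal marginal. The adjusted kernel $p^{\star}$ is normalized by construction, and the resulting path measure equals the prior path measure reweighted by the target at the final time.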


Say Less, Mean More: Leveraging Pragmatics in Retrieval-Augmented Generation

arXiv.org Artificial Intelligence

We propose a simple, unsupervised method that injects pragmatic principles into retrieval-augmented generation (RAG) frameworks such as Dense Passage Retrieval to enhance the utility of retrieved contexts. Our approach first identifies the sentences in the pool of documents retrieved by RAG that are most relevant to the question at hand and that cover all the topics addressed in the input question (and no more); it then highlights these sentences within their context before they are provided to the LLM, without truncating or otherwise altering the context. We show that this simple idea brings consistent improvements in experiments on three question answering tasks (ARC-Challenge, PubHealth, and PopQA) using five different LLMs. It notably enhances relative accuracy by up to 19.7% on PubHealth and 10% on ARC-Challenge compared to a conventional RAG system.
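
Below is a minimal sketch of the core idea (highlighting question-relevant sentences in place rather than truncating the context) using sentence-transformers; the encoder, the top-k selection rule, and the `**...**` highlight markers are illustrative choices, not the paper's pragmatic selection criterion.

```python
from sentence_transformers import SentenceTransformer, util

# Illustrative encoder; any sentence embedding model could be substituted.
encoder = SentenceTransformer("all-MiniLM-L6-v2")

def highlight_relevant(question: str, passage: str, top_k: int = 2) -> str:
    sentences = [s.strip() for s in passage.split(". ") if s.strip()]
    q_emb = encoder.encode(question, convert_to_tensor=True)
    s_emb = encoder.encode(sentences, convert_to_tensor=True)
    scores = util.cos_sim(q_emb, s_emb)[0]
    keep = set(scores.topk(min(top_k, len(sentences))).indices.tolist())
    # Mark the selected sentences in place; the rest of the context is untouched.
    marked = [f"**{s}**" if i in keep else s for i, s in enumerate(sentences)]
    return ". ".join(marked)

context = highlight_relevant(
    "What causes tides?",
    "Tides are caused by the gravitational pull of the moon. "
    "Coastal erosion reshapes shorelines. "
    "The sun also contributes to tidal forces.")
print(context)  # prepend to the LLM prompt alongside the question
```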


Personalized Education with Generative AI and Digital Twins: VR, RAG, and Zero-Shot Sentiment Analysis for Industry 4.0 Workforce Development

arXiv.org Artificial Intelligence

While Fourth Industrial Revolution (4IR) technologies such as cloud computing, machine learning, and artificial intelligence have brought convenience and productivity improvements, they have also introduced new challenges in training and education that require reskilling existing employees and building a new workforce. Exacerbated by already existing workforce shortages, this mammoth reskilling and workforce-building effort aims to produce a high-tech workforce capable of operating and maintaining 4IR systems, which in turn requires higher student retention and persistence. Increased retention and persistence will be especially critical when training workforce members from marginalized communities such as Underrepresented Minorities (URM), where challenges arise from a lack of access to high-quality education throughout trainees' formative years (pre/middle/high school), creating a cyclic set of knowledge dependencies that are difficult to meet. To address these challenges, this research presents the Generative AI-based Personalized Tutor for Industry 4.0 (gAI-PT4I4), a framework that personalizes 4IR experiential learning, using sentiment analysis to gauge students' knowledge comprehension and a combination of generative AI and a finite automaton to tailor content to students' learning needs. The framework administers experiential learning using low-fidelity Digital Twins that enable virtual reality-based (VR) training exercises focused on 4IR skills. The VR environment integrates a generative AI teaching assistant, called the Interactive Tutor, that guides the student through the training exercises via audio and text communication.
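
For the zero-shot sentiment component, a generic recipe with the Hugging Face zero-shot-classification pipeline looks like the sketch below; the model checkpoint and the comprehension labels are assumptions for illustration, not the framework's actual configuration.

```python
from transformers import pipeline

# Generic zero-shot classifier; model and label set are illustrative placeholders.
classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

student_reply = "I think I wired the PLC correctly, but I'm not sure why the conveyor stops."
labels = ["confident understanding", "partial understanding", "confused"]

result = classifier(student_reply, candidate_labels=labels)
print(result["labels"][0], result["scores"][0])  # top label could drive the next tutoring step
```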


Multi-Agent Actor-Critic Generative AI for Query Resolution and Analysis

arXiv.org Artificial Intelligence

In this paper, we introduce MASQRAD (Multi-Agent Strategic Query Resolution and Diagnostic tool), a transformative framework for query resolution based on the actor-critic model, which utilizes multiple generative AI agents. MASQRAD excels at translating imprecise or ambiguous user inquiries into precise and actionable requests. The framework generates pertinent visualizations and responses to these focused queries, as well as thorough analyses and insightful interpretations for users. MASQRAD addresses common shortcomings of existing solutions in domains that demand fast and precise data interpretation, namely their inability to apply AI successfully to generate actionable insights and their difficulty with the inherent ambiguity of user queries. MASQRAD functions as a sophisticated multi-agent system but "masquerades" to users as a single AI entity, which lowers errors and enhances data interaction. The approach uses three primary AI agents: Actor Generative AI, Critic Generative AI, and Expert Analysis Generative AI, each crucial for creating, refining, and evaluating data interactions. The Actor AI generates Python scripts that produce data visualizations from large datasets within operational constraints, and the Critic AI rigorously refines these scripts through multi-agent debate. Finally, the Expert Analysis AI contextualizes the outcomes to aid decision-making. With an accuracy rate of 87% on natural language visualization tasks, MASQRAD establishes new benchmarks for automated data interpretation and represents a noteworthy advancement with the potential to transform AI-driven applications.
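
The actor-critic loop can be pictured with a short sketch like the one below, where `generate` stands in for any chat-completion call; the prompts, approval rule, and round limit are hypothetical and do not reproduce MASQRAD's multi-agent debate protocol.

```python
# Minimal actor-critic refinement loop with a generic LLM callable.
# `generate(prompt)` is a hypothetical stand-in for whatever LLM API is used.

def actor_critic_round(generate, user_query: str, max_rounds: int = 3):
    # Actor: draft a visualization script for the (clarified) user query.
    script = generate(f"Write a Python script that visualizes: {user_query}")
    for _ in range(max_rounds):
        # Critic: review the draft and either approve it or request changes.
        critique = generate(
            "Review this visualization script for correctness, clarity, and safety. "
            f"Reply APPROVED if no changes are needed.\n\n{script}"
        )
        if "APPROVED" in critique:
            break
        # Actor: revise the script to address the critique.
        script = generate(
            f"Revise the script to address this critique.\n\nCritique:\n{critique}\n\nScript:\n{script}"
        )
    # Expert analysis: contextualize the result for decision-making.
    analysis = generate(
        f"Explain what this script's output shows and how it answers '{user_query}'.\n\n{script}"
    )
    return script, analysis
```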


Speeding up Speculative Decoding via Approximate Verification

arXiv.org Artificial Intelligence

Speculative Decoding (SD) is a recently proposed technique for faster inference with Large Language Models (LLMs). SD uses a smaller draft LLM to autoregressively generate a sequence of tokens and a larger target LLM for parallel verification to ensure statistical consistency. However, the periodic parallel calls to the target LLM for verification prevent SD from achieving even lower latencies. We propose SPRINTER, which utilizes a low-complexity verifier trained to predict whether tokens generated by the draft LLM would be accepted by the target LLM. By performing approximate sequential verification, SPRINTER avoids routine verification by the target LLM, which is invoked only when a token is deemed unacceptable. This reduces the number of calls to the larger LLM and can achieve further speedups. We present a theoretical analysis of SPRINTER, examining the statistical properties of the generated tokens as well as the expected reduction in latency as a function of the verifier. We evaluate SPRINTER on several datasets and model pairs, demonstrating that approximate verification can maintain high-quality generation while further reducing latency. For instance, on the Wiki-Summaries dataset, SPRINTER achieves a 1.7x latency speedup and requires 8.3x fewer FLOPs relative to SD, while still generating high-quality responses when using GPT2-Small and GPT2-XL as the draft/target models.
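
The decoding loop can be sketched as below, assuming Hugging Face-style causal LMs (with a `.logits` output) and treating `verifier` as any callable that returns an acceptance probability; the greedy proposals, the fixed threshold, and the fallback rule are simplifications, not the paper's exact acceptance mechanism.

```python
import torch

def approximate_verification_decode(draft_model, target_model, verifier, prompt_ids,
                                    max_new_tokens=64, threshold=0.5):
    """Sketch: the draft model proposes tokens one at a time, a small binary verifier
    predicts whether the target model would accept each one, and the expensive target
    model is consulted only when a token is flagged as unacceptable."""
    ids = prompt_ids
    for _ in range(max_new_tokens):
        draft_logits = draft_model(ids).logits[:, -1, :]
        token = draft_logits.argmax(dim=-1, keepdim=True)        # greedy draft proposal
        p_accept = verifier(ids, token)                          # cheap approximate check
        if p_accept < threshold:
            target_logits = target_model(ids).logits[:, -1, :]   # fall back to the target LLM
            token = target_logits.argmax(dim=-1, keepdim=True)
        ids = torch.cat([ids, token], dim=-1)
    return ids
```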


Containment Control Approach for Steering Opinion in a Social Network

arXiv.org Artificial Intelligence

The paper studies the problem of steering multi-dimensional opinion in a social network. Assuming the society of interest consists of stubborn and regular agents, stubborn agents are considered leaders who specify the desired opinion distribution through a distributed reward or utility function. In this context, each regular agent is seen as a follower that updates its bias on the initial opinion and its influence weights by averaging its observations of the rewards received by its influencers. Assuming random graphs with reducible and irreducible topologies specify the influences on regular agents, opinion evolution is represented as a containment control problem for which stability and convergence to the final opinion are proven.
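
As a point of reference, a standard discrete-time containment-control update (with leader set $\mathcal{L}$ and follower set $\mathcal{F}$) has the form shown below; the paper's followers additionally adapt their biases and influence weights from observed rewards, which this generic form omits.

$$x_\ell(k+1) = x_\ell(k), \quad \ell \in \mathcal{L}, \qquad x_i(k+1) = \sum_{j \in \mathcal{N}_i} w_{ij}(k)\, x_j(k), \quad i \in \mathcal{F},$$

with $w_{ij}(k) \ge 0$ and $\sum_{j \in \mathcal{N}_i} w_{ij}(k) = 1$. Under suitable connectivity of the (possibly random) influence graph, every follower's opinion converges to the convex hull spanned by the stubborn leaders' opinions.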